Take-home Exercise 1 - Geospatial Analytics for Public Good

Author

Flora Phyo Zin Htet

Overview

As city-wide urban infrastructures such as buses, taxis, mass rapid transit, public utilities and roads become digital, the datasets obtained can be used as a framework for tracking movement patterns through space and time. This is particularly true with the recent trend of massive deployment of pervasive computing technologies such as GPS and RFID on vehicles. For example, routes and ridership data are collected through the smart cards and Global Positioning System (GPS) devices available on public buses. These massive movement data are likely to contain structures and patterns that provide useful information about the characteristics of the measured phenomena. The identification, analysis and comparison of such patterns will provide greater insight into human movement and behaviour within a city. These understandings will potentially contribute to better urban management and provide useful information for urban transport service providers, both private and public, to formulate informed decisions and gain competitive advantage.

In real-world practice, however, the use of these massive location-aware data tends to be confined to simple tracking and mapping with GIS applications. This is mainly due to a general lack of functions in conventional GIS that are capable of analysing and modelling spatial and spatio-temporal data effectively.

Getting Started

Installing and Loading the R Packages

The code chunk below installs and loads the sf, sfdep, spdep, tmap and tidyverse packages into the R environment.

pacman::p_load(sf, sfdep, spdep, tmap, tidyverse)

Data Preparation

The datasets used in this assignment are:

Geospatial data : Bus Stop Location from LTA DataMall. It provides information about all the bus stops currently being serviced by buses, including the bus stop code (identifier) and location coordinates.

Aspatial data : Passenger Volume by Origin Destination Bus Stops for Oct 2023 downloaded from LTA DataMall.

Geospatial data

Import Bus Stop Geospatial Data into R environment

Load the Bus Stop GIS data by using st_read() from the sf package.

bs = st_read(dsn="data/geospatial", layer="BusStop")
Reading layer `BusStop' from data source 
  `D:\zhphyo\ISSS624\Take-home_Ex1\data\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21

Data Wrangling

Clean the data to ensure the accuracy before conducting our analysis.

Check duplicate geometry

The code chunk below converts each geometry to text and identifies the rows with duplicate geometries. We use st_as_text() from the sf package to convert the geometry objects to their WKT (Well-Known Text) string representation, and then use the duplicated() function to find the records with the same geometry in the bs data frame.

From the result below, we identified 2 duplicate records.

# Calculate WKT for each geometry
bs <- bs %>% 
  mutate(temp_geo = st_as_text(geometry))

# Find duplicate geometries
duplicate_geometries <- bs %>% 
  filter(duplicated(temp_geo) | duplicated(temp_geo, fromLast = TRUE))

# Print duplicate geometries
print(duplicate_geometries)
Simple feature collection with 2 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 42187.23 ymin: 34995.78 xmax: 42187.23 ymax: 34995.78
Projected CRS: SVY21
  BUS_STOP_N BUS_ROOF_N        LOC_DESC                  geometry
1      96319       <NA> Yusen Logistics POINT (42187.23 34995.78)
2      96319        NIL YUSEN LOGISTICS POINT (42187.23 34995.78)
                   temp_geo
1 POINT (42187.23 34995.78)
2 POINT (42187.23 34995.78)

We will now use the filter() function of the dplyr package to keep only the non-duplicate rows in the bs data frame. Lastly, we remove the temp_geo column, the temporary column used to check for duplicate geometry values.

# Remove duplicate geometries
bs <- bs %>% 
  filter(!duplicated(temp_geo))

#remove the temp column if not needed
bs <- bs %>% select(-temp_geo)

Check duplicate Bus Stop No

The code chunk below identifies the rows with duplicate BUS_STOP_N. We use the duplicated() function to find the records with the same BUS_STOP_N in the bs data frame.

From the result below, we identified 30 duplicate records.

# Copy BUS_STOP_N into a temporary column
bs <- bs %>% 
  mutate(temp_bs = BUS_STOP_N)

# Find duplicate bus stop numbers
duplicate_bs_n <- bs %>% 
  filter(duplicated(temp_bs) | duplicated(temp_bs, fromLast = TRUE))

# Print duplicate bus stop numbers
print(duplicate_bs_n)
Simple feature collection with 30 features and 4 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 13488.02 ymin: 32594.17 xmax: 44144.57 ymax: 47934
Projected CRS: SVY21
First 10 features:
   BUS_STOP_N BUS_ROOF_N            LOC_DESC                  geometry temp_bs
1       58031        UNK     OPP CANBERRA DR  POINT (27089.69 47570.9)   58031
2       62251        B03        Bef Blk 471B POINT (35500.54 39943.41)   62251
3       47201        UNK                <NA> POINT (22616.75 47793.68)   47201
4       58031        UNK     OPP CANBERRA DR POINT (27111.07 47517.77)   58031
5       22501        B02            Blk 662A  POINT (13489.09 35536.4)   22501
6       82221        B01              BLK 3A  POINT (35323.6 33257.05)   82221
7       68091        B01        AFT BAKER ST POINT (32164.11 42695.98)   68091
8       43709        B06             BLK 644  POINT (18963.42 36762.8)   43709
9       97079        B14 OPP ST. JOHN'S CRES POINT (44144.57 38980.25)   97079
10      82221        B01              Blk 3A POINT (35308.74 33335.17)   82221

We will now use the filter() function of the dplyr package to keep only the non-duplicate rows in the bs data frame. Lastly, we remove the temp_bs column, the temporary column used to check for duplicate BUS_STOP_N values.

# Remove duplicate bus stop numbers
bs <- bs %>% 
  filter(!duplicated(temp_bs))

#remove the temp column if not needed
bs <- bs %>% select(-temp_bs)

Transform Data

Transform the dataset to CRS 3414 (SVY21 / Singapore TM) and extract the X and Y coordinates from the geometry, then join the X, Y coordinates back to the bs dataset.

bs = st_transform(bs, 3414)

bs_XY = do.call(rbind, st_geometry(bs)) %>% 
    as_tibble() %>% setNames(c("X","Y"))
Warning: The `x` argument of `as_tibble.matrix()` must have unique column names if
`.name_repair` is omitted as of tibble 2.0.0.
ℹ Using compatibility `.name_repair`.
bs = cbind(bs, bs_XY)
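As a side note, the as_tibble() warning above can be avoided by extracting the coordinates with st_coordinates() of the sf package. The sketch below is an alternative to, not an addition to, the chunk above:

```r
# Alternative: st_coordinates() returns a matrix whose columns are already
# named X and Y, so no .name_repair warning is raised.
bs_XY <- as_tibble(st_coordinates(bs))
bs <- cbind(bs, bs_XY)
```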

Use the st_crs() function from the sf package to validate and obtain information about the coordinate reference system associated with the bs object.

st_crs(bs)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

Before starting to analyse the data, use glimpse() to inspect the attributes of the spatial features and gain an understanding of our dataset.

glimpse(bs)
Rows: 5,145
Columns: 6
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC   <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ X          <dbl> 13576.312, 13228.592, 21045.101, 41603.764, 24568.738, 3095…
$ Y          <dbl> 32883.65, 44206.38, 40242.08, 35413.11, 30391.85, 38079.61,…
$ geometry   <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…

Aspatial data

Import Passenger Volume csv file into R environment

Use Postman to download the Passenger Volume by Origin Destination Bus Stops dataset from LTA DataMall.

Import origin_destination_bus_202310.csv into R by using read_csv() of readr package. The output is R data frame class, pv.

pv <- read_csv("data/aspatial/origin_destination_bus_202310.csv")
Rows: 5694297 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): YEAR_MONTH, DAY_TYPE, PT_TYPE, ORIGIN_PT_CODE, DESTINATION_PT_CODE
dbl (2): TIME_PER_HOUR, TOTAL_TRIPS

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Use the head() function to print the first few rows of the data frame to quickly inspect the dataset’s structure and content.

head(pv)
# A tibble: 6 × 7
  YEAR_MONTH DAY_TYPE   TIME_PER_HOUR PT_TYPE ORIGIN_PT_CODE DESTINATION_PT_CODE
  <chr>      <chr>              <dbl> <chr>   <chr>          <chr>              
1 2023-10    WEEKENDS/…            16 BUS     04168          10051              
2 2023-10    WEEKDAY               16 BUS     04168          10051              
3 2023-10    WEEKENDS/…            14 BUS     80119          90079              
4 2023-10    WEEKDAY               14 BUS     80119          90079              
5 2023-10    WEEKDAY               17 BUS     44069          17229              
6 2023-10    WEEKENDS/…            17 BUS     20281          20141              
# ℹ 1 more variable: TOTAL_TRIPS <dbl>

Function to Perform relational join

Write a function to join the Bus Stop (bs) data with the attribute fields of the Passenger Volume (pv) dataset. This is performed by using left_join() of the dplyr package. The function also converts the data into a spatial object using st_as_sf() of the sf package and filters out the records where any of the columns X, Y and ORIGIN_TOTAL_TRIPS is missing (NA).

This function will be used in the [Geovisualisation and Analysis] section.

joined_bs_pv <- function(bs, pv) {
  joined_data <- left_join(bs, pv, by = c("BUS_STOP_N" = "ORIGIN_PT_CODE")) %>%
    select(BUS_STOP_N, X, Y, ORIGIN_TOTAL_TRIPS, geometry)
  
  sf = joined_data %>%
    # lng/lat value are missing in some records
    filter(!is.na(X) & !is.na(Y) & !is.na(ORIGIN_TOTAL_TRIPS)) %>%
    st_as_sf(coords = c("X", "Y"), crs = 3414, remove = FALSE)
  
  return (sf)
}

Create and plot spatial hexagon grids

Function to Create hexagon grid

Write a function that creates a hexagonal grid over a given set of spatial points (input), counts and sums the trip data within each grid cell, and then filters out cells with no or zero trips. It is used in the [Geovisualisation and Analysis] section to understand the trip distribution.

honeycomb_grid <- function(input) {
  #Create a grid which the extent equals to the bounding box of the selected points
  area_honeycomb_grid <- st_make_grid(input, c(500), what = "polygons", square = FALSE)
  
  # To sf and add grid ID
  area_honeycomb_grid_sf <- st_sf(area_honeycomb_grid) %>%
    # add grid ID
    mutate(grid_id = 1:length(lengths(area_honeycomb_grid)))
  
  #Sum the number of passenger points in each grid
  count_points <- st_intersection(area_honeycomb_grid_sf, input) %>%
    group_by(grid_id) %>%
    summarize(sum_trips = sum(ORIGIN_TOTAL_TRIPS, na.rm = TRUE)) %>%
    rename(count_grid_id = grid_id)
  
  #Merge the point counts back to the honeycomb grid
  area_honeycomb_grid_sf <- area_honeycomb_grid_sf %>%
    st_join(count_points, by = "grid_id")
  
  # Remove grid without value 
  filtered_area_honeycomb_grid_sf <- area_honeycomb_grid_sf %>%
    filter(!is.na(sum_trips) & sum_trips > 0) %>%
    select(-count_grid_id)
  
  filtered_area_honeycomb_grid_sf <- filtered_area_honeycomb_grid_sf %>%
  rename(geometry=area_honeycomb_grid)
  
  return (filtered_area_honeycomb_grid_sf)
}

Function to Plot hexagon grid

This code chunk plots the grid as an interactive map with tmap. The function is used in the [Geovisualisation and Analysis] section to plot a map with the hexagon grid.

plot_honeycomb_grid_with_trips <- function(input) {
  tmap_mode("view")
 
  # Plot the honeycomb grid with the summed trips
  tm_shape(input) +
    tm_fill(
      col = "sum_trips",
      palette = "Reds",
      style = "fixed",
      breaks = c(1, 1000, 5000, 10000, 50000, 100000, 300000, 600000),
      title = "Total Trips",
      id = "grid_id",
      showNA = FALSE,
      alpha = 0.6,
      popup.vars = c(
        "Total Trips: " = "sum_trips"
      ),
      popup.format = list(
        sum_trips = list(format = "f", digits = 0)
      )
    ) +
    tm_borders(col = "grey40", lwd = 0.7)
  }

Geovisualisation and Analysis

In this section we will use maps and spatial data tools to uncover patterns, trends and insights. With reference to the time intervals provided in the table below, we compute the passenger trips generated by origin at the hexagon level.

Peak hour period               Bus tap on time
Weekday morning peak           6am to 9am
Weekday afternoon peak         5pm to 8pm
Weekend/holiday morning peak   11am to 2pm
Weekend/holiday evening peak   4pm to 7pm
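Since each peak period repeats the same filter-and-summarise steps, they could be wrapped in a small helper. The sketch below is illustrative only (the function name filter_peak is an assumption, not part of the original code), and mirrors the chunks in the subsections that follow:

```r
# Hypothetical helper: total origin trips for one peak period (hours inclusive)
filter_peak <- function(pv, day_type, start_hr, end_hr, yr_month = "2023-10") {
  pv %>%
    filter(YEAR_MONTH == yr_month,
           DAY_TYPE == day_type,
           TIME_PER_HOUR >= start_hr,
           TIME_PER_HOUR <= end_hr) %>%
    group_by(YEAR_MONTH, DAY_TYPE, ORIGIN_PT_CODE) %>%
    summarise(ORIGIN_TOTAL_TRIPS = sum(TOTAL_TRIPS, na.rm = TRUE),
              .groups = "drop")
}

# e.g. the weekday morning peak would be filter_peak(pv, "WEEKDAY", 6, 9)
```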

Analysis

Weekday morning peak

Filter data

The code chunk below creates the dataset pv_wkday_6to9_202310 by filtering the pv dataset for the weekday morning peak between 6AM and 9AM, and computes the total passenger trips at the origin for each bus stop, using the columns YEAR_MONTH, DAY_TYPE and ORIGIN_PT_CODE as grouping criteria.

pv_wkday_6to9_202310 <- pv %>%
  group_by(YEAR_MONTH, DAY_TYPE, ORIGIN_PT_CODE) %>%
  filter(YEAR_MONTH == "2023-10" & DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9) %>%
  summarise(ORIGIN_TOTAL_TRIPS = sum(TOTAL_TRIPS, na.rm = TRUE))
`summarise()` has grouped output by 'YEAR_MONTH', 'DAY_TYPE'. You can override
using the `.groups` argument.

Performing relational join

Call function joined_bs_pv to join Bus Stop (bs) with the attribute fields of Passenger Volume (pv) dataset for weekday morning peak.

wkday_6to9_202310_sf <- joined_bs_pv (bs, pv_wkday_6to9_202310)

Create hexagon grid

Call function honeycomb_grid() to create a hexagonal grid for the weekday morning peak.

wkday_6to9_trips <- honeycomb_grid(wkday_6to9_202310_sf) 
Warning: attribute variables are assumed to be spatially constant throughout
all geometries

Plot hexagon grid

Call function plot_honeycomb_grid_with_trips() to plot interactive map for weekday morning peak.

plot_honeycomb_grid_with_trips(wkday_6to9_trips) 
tmap mode set to interactive viewing

Analysis

In suburban areas like Woodlands/Yishun, Choa Chu Kang and Jurong, it is quite common to observe significant passenger trip volumes in the morning rush, ranging from 100,000 to 300,000. In addition, the Punggol/Sengkang area also carries substantial traffic, ranging from 50,000 to 300,000 trips, which indicates a substantial commuter population. It is therefore clear that commuter demand is concentrated in the morning rush hour, so there is a need to enhance transportation services and manage congestion in these areas.

Weekday afternoon peak

Filter data

The code chunk below creates the dataset pv_wkday_5to8_202310 by filtering the pv dataset for the weekday afternoon peak between 5PM and 8PM, and computes the total passenger trips at the origin for each bus stop, using the columns YEAR_MONTH, DAY_TYPE and ORIGIN_PT_CODE as grouping criteria.

pv_wkday_5to8_202310 <- pv %>%
  group_by(YEAR_MONTH, DAY_TYPE, ORIGIN_PT_CODE) %>%
  filter(YEAR_MONTH == "2023-10" & DAY_TYPE == "WEEKDAY" & TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20) %>%
  summarise(ORIGIN_TOTAL_TRIPS = sum(TOTAL_TRIPS, na.rm = TRUE))
`summarise()` has grouped output by 'YEAR_MONTH', 'DAY_TYPE'. You can override
using the `.groups` argument.

Performing relational join

Call function joined_bs_pv to join Bus Stop (bs) with the attribute fields of Passenger Volume (pv) dataset for weekday afternoon peak.

wkday_5to8_202310_sf <- joined_bs_pv (bs, pv_wkday_5to8_202310)

Create hexagon grid

Call function honeycomb_grid() to create a hexagonal grid for the weekday afternoon peak.

wkday_5to8_trips <- honeycomb_grid(wkday_5to8_202310_sf) 
Warning: attribute variables are assumed to be spatially constant throughout
all geometries

Plot hexagon grid

Call function plot_honeycomb_grid_with_trips() to plot interactive map for weekday afternoon peak.

plot_honeycomb_grid_with_trips(wkday_5to8_trips)
tmap mode set to interactive viewing

Analysis

While trip volumes are lower compared to the morning peak, select residential areas still experience 50,000 to 300,000 trips, indicating the presence of evening commuters. Analyzing the reasons behind evening commuting, such as flexible work hours, can help optimize transit schedules and infrastructure.

Weekend/holiday morning peak

Filter data

The code chunk below creates the dataset pv_wkend_11to2_202310 by filtering the pv dataset for the weekend/holiday morning peak between 11AM and 2PM, and computes the total passenger trips at the origin for each bus stop, using the columns YEAR_MONTH, DAY_TYPE and ORIGIN_PT_CODE as grouping criteria.

pv_wkend_11to2_202310 <- pv %>%
  group_by(YEAR_MONTH, DAY_TYPE, ORIGIN_PT_CODE) %>%
  filter(YEAR_MONTH == "2023-10" & DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14) %>%
  summarise(ORIGIN_TOTAL_TRIPS = sum(TOTAL_TRIPS, na.rm = TRUE))
`summarise()` has grouped output by 'YEAR_MONTH', 'DAY_TYPE'. You can override
using the `.groups` argument.

Performing relational join

Call function joined_bs_pv to join Bus Stop (bs) with the attribute fields of Passenger Volume (pv) dataset for weekend/holiday morning peak.

wkend_11to2_202310_sf <- joined_bs_pv (bs, pv_wkend_11to2_202310)

Create hexagon grid

Call function honeycomb_grid() to create a hexagonal grid for the weekend/holiday morning peak.

wkend_11to2_trips <- honeycomb_grid(wkend_11to2_202310_sf) 
Warning: attribute variables are assumed to be spatially constant throughout
all geometries

Plot hexagon grid

Call function plot_honeycomb_grid_with_trips() to plot interactive map for weekend/holiday morning peak.

plot_honeycomb_grid_with_trips(wkend_11to2_trips) 
tmap mode set to interactive viewing

Analysis

There is a consistent number of trips generated during weekend mornings in residential areas, ranging from 5,000 to 50,000, which is a sign that residents and visitors are engaged in leisure activities or on family outings at this time. In order to meet the needs of a leisure-oriented crowd in these regions, it is imperative that recreational facilities and transit options be optimized.

Weekend/holiday evening peak

Filter data

The code chunk below creates the dataset pv_wkend_4to7_202310 by filtering the pv dataset for the weekend/holiday evening peak between 4PM and 7PM, and computes the total passenger trips at the origin for each bus stop, using the columns YEAR_MONTH, DAY_TYPE and ORIGIN_PT_CODE as grouping criteria.

pv_wkend_4to7_202310 <- pv %>%
  group_by(YEAR_MONTH, DAY_TYPE, ORIGIN_PT_CODE) %>%
  filter(YEAR_MONTH == "2023-10" & DAY_TYPE == "WEEKENDS/HOLIDAY" & TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19) %>%
  summarise(ORIGIN_TOTAL_TRIPS = sum(TOTAL_TRIPS, na.rm = TRUE))
`summarise()` has grouped output by 'YEAR_MONTH', 'DAY_TYPE'. You can override
using the `.groups` argument.

Performing relational join

Call function joined_bs_pv to join Bus Stop (bs) with the attribute fields of the Passenger Volume (pv) dataset for the weekend/holiday evening peak.

wkend_4to7_202310_sf <- joined_bs_pv (bs, pv_wkend_4to7_202310)

Create hexagon grid

Call function honeycomb_grid() to create a hexagonal grid for the weekend/holiday evening peak.

wkend_4to7_trips <- honeycomb_grid(wkend_4to7_202310_sf) 
Warning: attribute variables are assumed to be spatially constant throughout
all geometries

Plot hexagon grid

Call function plot_honeycomb_grid_with_trips() to plot an interactive map for the weekend/holiday evening peak.

plot_honeycomb_grid_with_trips(wkend_4to7_trips) 
tmap mode set to interactive viewing

Analysis

During the evening peak on weekends and holidays, trip volumes between 5,000 and 50,000 indicate ongoing social or leisure activities in residential areas. It’s vital to have adequate parking and public transportation, especially in popular entertainment districts at night. When planning recreational facilities, it’s helpful to know what kind of activities and attractions draw visitors during these times.

Observation

Passenger trip patterns during both weekday morning and afternoon underscore the importance of commuter-centric transportation. In the morning peak, commuters are concentrated in suburban areas, so transportation services need to be improved. The afternoon peak indicates evening commuters, suggesting transit schedules be optimized to accommodate flexible work hours. In order to meet commuters’ diverse needs throughout the day, a well-adapted and efficient transportation system is crucial.

We can observe that for Weekend/Holiday mornings and evenings, trip volumes are more evenly distributed across residential areas, suggesting leisure activities or family outings. These patterns call for optimizing recreational facilities and transit options in these regions to cater to the weekend and holiday crowd.

By analyzing these spatial patterns, urban planners and policymakers can optimize transportation infrastructure, identify high-demand areas for public transportation, and plan for congestion management during peak periods in a way that maximizes efficiency. In addition, the analysis provides valuable insights into commuter behavior, emphasizing the need to tailor transportation services within Singapore to different time intervals and geographical areas.

Local Indicators of Spatial Association (LISA) Analysis

Spatial weights come in two primary forms:

  • contiguity weights and

  • distance-based weights.

In situations where hexagons don’t share borders or adjacency, like in our case, distance-based spatial weights are more suitable.

Contiguity-based weights vs distance-based weights

Contiguity-based weights focus exclusively on neighboring units that share a boundary. In contrast, distance-based weights extend this concept by accounting for the influence of both nearby and more distant neighbors. This approach provides a more comprehensive understanding of spatial interactions, especially when the effects of a phenomenon aren’t confined to immediate neighbors.
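The difference shows up directly in the neighbour lists the two approaches produce. A minimal sketch with sfdep, assuming an sf object hex_sf holding the hexagon grid:

```r
# Contiguity: neighbours must share a boundary, so isolated hexagons
# end up with empty neighbour sets.
nb_contig <- st_contiguity(hex_sf$geometry)

# Distance-based (k-nearest neighbours): every hexagon is assigned
# exactly 8 neighbours, however far away they are.
nb_knn <- st_knn(hex_sf$geometry, k = 8)
```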

Why distance-based weights?

In this exercise, the commuting behaviors could influence areas beyond immediate neighbors, making distance-based weights a more suitable choice for the study. Distance-based weights are adept at capturing these broader, non-adjacent interactions, offering a more accurate representation of spatial relationships in such scenarios. Distance-based weights take into account how far two areas are separated, so areas that are closer together receive more weight than those that are farther away. This allows for a more accurate representation of how phenomena spread and interact within a complex environment.

We further delve into the three main sub-types:

  • fixed distance weights,

  • adaptive distance weights, and

  • inverse distance weights (IDW).
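A hedged sketch of how each sub-type could be computed with sfdep, again assuming an sf object hex_sf in a projected CRS (so distances are in metres); the 1000 m band and k = 8 are illustrative values:

```r
# 1. Fixed distance: all cells within 1000 m of each other are neighbours
nb_fixed <- st_dist_band(hex_sf$geometry, lower = 0, upper = 1000)
wt_fixed <- st_weights(nb_fixed)

# 2. Adaptive distance: each cell gets exactly its k = 8 nearest neighbours
nb_adapt <- st_knn(hex_sf$geometry, k = 8)
wt_adapt <- st_weights(nb_adapt)

# 3. Inverse distance: nearer neighbours receive proportionally larger weights
nb_idw <- st_knn(hex_sf$geometry, k = 8)
wt_idw <- st_inverse_distance(nb_idw, hex_sf$geometry, scale = 1, alpha = 1)
```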

Why adaptive distance weights?

A highly urbanized and densely populated city-state was considered in this exercise when choosing the spatial weight type. Here, the hexagon cells are unevenly distributed, so adaptive distance weights are more effective. Adaptive distance weights ensure that each unit has a set number of neighbours, thereby accommodating areas with varying densities and providing a more consistent and representative measure of local spatial relationships compared to the other two methods, which might overlook isolated units or disproportionately weight closer units.

Weekday morning peak

Computing adaptive distance weight matrix

wm_ad <- wkday_6to9_trips %>% 
  mutate(nb = st_knn(geometry,
                     k=8),
         wt = st_weights(nb),
               .before = 1)
! Polygon provided. Using point on surface.
wm_ad
Simple feature collection with 1492 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 3720.122 ymin: 26193.43 xmax: 48720.12 ymax: 53184.55
Projected CRS: SVY21 / Singapore TM
First 10 features:
                              nb
1      2, 4, 5, 8, 9, 12, 22, 23
2      1, 4, 5, 8, 9, 12, 22, 23
3    5, 6, 9, 10, 13, 14, 16, 17
4     1, 2, 8, 9, 12, 22, 23, 38
5    3, 6, 9, 10, 13, 16, 17, 24
6    3, 5, 7, 10, 14, 17, 18, 25
7  6, 11, 14, 15, 18, 19, 26, 32
8     1, 2, 4, 9, 12, 16, 22, 23
9   5, 8, 10, 12, 13, 16, 23, 24
10    3, 5, 6, 9, 13, 14, 17, 24
                                                       wt grid_id sum_trips
1  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      34       103
2  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      65        52
3  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      99        78
4  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     127       185
5  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     129      1116
6  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     130        53
7  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     131        60
8  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     159        64
9  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     160       251
10 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     161      1034
                         geometry
1  POLYGON ((3970.122 27925.48...
2  POLYGON ((4220.122 28358.49...
3  POLYGON ((4470.122 30523.55...
4  POLYGON ((4720.122 28358.49...
5  POLYGON ((4720.122 30090.54...
6  POLYGON ((4720.122 30956.57...
7  POLYGON ((4720.122 31822.59...
8  POLYGON ((4970.122 28791.5,...
9  POLYGON ((4970.122 29657.53...
10 POLYGON ((4970.122 30523.55...

Computing Global Moran’s I

In the code chunk below, the global_moran() function is used to compute the Moran’s I value. Unlike the spdep package, the output is a tibble data.frame.

moranI <- global_moran(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)
glimpse(moranI)
List of 2
 $ I: num 0.2
 $ K: num 43.5

Moran’s I value (0.2) shows that areas with similar passenger trips are slightly more likely to be close to each other than by random chance.

Performing Global Moran’s I test

global_moran_test(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)

    Moran I test under randomisation

data:  x  
weights: listw    

Moran I statistic standard deviate = 16.376, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     0.1995910649     -0.0006706908      0.0001495491 

Performing Global Moran’s I permutation test

A Monte Carlo simulation should be used to perform the statistical test. In sfdep, this is supported by global_moran_perm().

Use set.seed() before performing the simulation to ensure that the computation is reproducible.

set.seed(1234)

global_moran_perm(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt,
                  nsim = 99)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 100 

statistic = 0.19959, observed rank = 100, p-value < 2.2e-16
alternative hypothesis: two.sided

The report above shows that the p-value is smaller than the alpha value of 0.05. Hence, we reject the null hypothesis that the spatial pattern is spatially independent. Because the Moran’s I statistic is greater than 0, we can infer that the spatial distribution shows signs of clustering.
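For reference, the pseudo p-value of a permutation test is derived from the rank of the observed statistic among the simulated values; with nsim = 99 the smallest attainable one-sided value is 0.01, regardless of how the test prints it. A small sketch of the general formula (not sfdep-specific):

```r
# One-sided pseudo p-value: (no. of simulated values >= observed + 1) / (nsim + 1)
nsim <- 99
n_ge <- 0                 # observed rank 100 means no simulated value exceeds it
(n_ge + 1) / (nsim + 1)   # 0.01, the floor for 99 simulations
```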

Computing local Moran’s I

Compute Local Moran’s I of passenger trips at hexagon level by using local_moran() of sfdep package.

lisa <- wm_ad %>% 
  mutate(local_moran = local_moran(
    sum_trips, nb, wt, nsim = 99),
         .before = 1) %>%
  unnest(local_moran)

Visualising local Moran’s I and p-value

Plot both local Moran’s I and p-value maps next to each other for effective comparison.

tmap_mode("view")
tmap mode set to interactive viewing
map1 <- tm_shape(lisa) +
  tm_fill("ii") + 
  tm_borders(alpha = 0.5) +
  tm_view() +
  tm_layout(main.title = "Local Moran's I of Passenger Trips",
            main.title.size = 0.8)

map2 <- tm_shape(lisa) +
  tm_fill("p_ii_sim",
          breaks = c(0, 0.001, 0.01, 0.05, 1),
              labels = c("0.001", "0.01", "0.05", "Not sig")) + 
  tm_borders(alpha = 0.5) +
  tm_layout(main.title = "p-value of local Moran's I",
            main.title.size = 0.8)

tmap_arrange(map1, map2, ncol = 2)
Variable(s) "ii" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Visualising LISA map

lisa_sig <- lisa  %>%
  filter(p_ii_sim < 0.05)
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(lisa) +
  tm_polygons() +
  tm_borders(alpha = 0.5) +
tm_shape(lisa_sig) +
  tm_fill("mean") + 
  tm_borders(alpha = 0.4)
Warning: One tm layer group has duplicated layer types, which are omitted. To
draw multiple layers of the same type, use multiple layer groups (i.e. specify
tm_shape prior to each of them).

Weekday afternoon peak

Computing adaptive distance weight matrix

wm_ad <- wkday_5to8_trips %>% 
  mutate(nb = st_knn(geometry,
                     k=8),
         wt = st_weights(nb),
               .before = 1)
! Polygon provided. Using point on surface.
wm_ad
Simple feature collection with 1493 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 3720.122 ymin: 26193.43 xmax: 48720.12 ymax: 53184.55
Projected CRS: SVY21 / Singapore TM
First 10 features:
                              nb
1      2, 4, 5, 8, 9, 12, 22, 23
2      1, 4, 5, 8, 9, 12, 22, 23
3    5, 6, 9, 10, 13, 14, 16, 17
4     1, 2, 8, 9, 12, 22, 23, 38
5    3, 6, 9, 10, 13, 16, 17, 24
6    3, 5, 7, 10, 14, 17, 18, 25
7  6, 11, 14, 15, 18, 19, 26, 32
8     1, 2, 4, 9, 12, 16, 22, 23
9   5, 8, 10, 12, 13, 16, 23, 24
10    3, 5, 6, 9, 13, 14, 17, 24
                                                       wt grid_id sum_trips
1  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      34       390
2  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      65       114
3  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      99       291
4  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     127      1905
5  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     129      2899
6  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     130       241
7  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     131       368
8  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     159       296
9  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     160       297
10 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     161      2472
                         geometry
1  POLYGON ((3970.122 27925.48...
2  POLYGON ((4220.122 28358.49...
3  POLYGON ((4470.122 30523.55...
4  POLYGON ((4720.122 28358.49...
5  POLYGON ((4720.122 30090.54...
6  POLYGON ((4720.122 30956.57...
7  POLYGON ((4720.122 31822.59...
8  POLYGON ((4970.122 28791.5,...
9  POLYGON ((4970.122 29657.53...
10 POLYGON ((4970.122 30523.55...

Computing Global Moran’s I

In the code chunk below, the global_moran() function is used to compute the Moran’s I value. Unlike spdep, sfdep returns the result as a list containing the Moran’s I statistic (I) and the sample kurtosis (K).

moranI <- global_moran(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)
glimpse(moranI)
List of 2
 $ I: num 0.0559
 $ K: num 88

The Moran’s I value of 0.0559 shows that hexagons with similar passenger trip volumes are slightly more likely to be near one another than under spatial randomness. Based on the two Moran’s I values computed, the clustering is also stronger than during the weekday morning peak.

Performing Global Moran’s I test

global_moran_test(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)

    Moran I test under randomisation

data:  x  
weights: listw    

Moran I statistic standard deviate = 4.698, p-value = 1.314e-06
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     0.0558807460     -0.0006702413      0.0001448974 

Performing Global Moran’s I permutation test

In practice, a Monte Carlo simulation should be used to perform the statistical test. In sfdep, this is supported by global_moran_perm().

set.seed() is used before performing the simulation to ensure that the computation is reproducible.

set.seed(1234)

global_moran_perm(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt,
                  nsim = 99)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 100 

statistic = 0.055881, observed rank = 100, p-value < 2.2e-16
alternative hypothesis: two.sided

The report above shows that the p-value is smaller than the alpha value of 0.05. Hence, we reject the null hypothesis that the spatial pattern is spatially independent. Because the Moran’s I statistic is greater than 0, we can infer that the spatial distribution shows signs of clustering.
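The pseudo p-value of a permutation test follows simple arithmetic: the observed statistic is compared against the nsim simulated values. A minimal sketch of that calculation, using the numbers from the report above (nsim = 99, observed I ranked above all simulated values) — note that with 99 simulations the smallest attainable one-sided pseudo p-value is 0.01, which is why a larger nsim gives finer p-value resolution:

```r
# Conventional one-sided pseudo p-value for a permutation test (a sketch).
# n_geq is the count of simulated statistics >= the observed statistic;
# here it is 0 because the observed I was ranked highest (rank 100 of 100).
nsim  <- 99
n_geq <- 0
p_pseudo <- (n_geq + 1) / (nsim + 1)
p_pseudo  # 0.01, the smallest value 99 simulations can produce
```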

Computing local Moran’s I

Compute the local Moran’s I of passenger trips at the hexagon level by using local_moran() of the sfdep package.

lisa <- wm_ad %>% 
  mutate(local_moran = local_moran(
    sum_trips, nb, wt, nsim = 99),
         .before = 1) %>%
  unnest(local_moran)
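After unnest(), each hexagon carries its own local statistic and cluster label as ordinary columns. A quick way to inspect them (a sketch; ii, p_ii_sim and mean are the column names produced by sfdep::local_moran() that are used in the maps below):

```r
# Peek at the local Moran's I statistic (ii), its simulated p-value
# (p_ii_sim) and the High-High/Low-Low cluster label (mean) per hexagon.
lisa %>%
  st_drop_geometry() %>%
  select(ii, p_ii_sim, mean) %>%
  head()
```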

Visualising local Moran’s I and p-value

Plot both local Moran’s I and p-value maps next to each other for effective comparison.

tmap_mode("view")
tmap mode set to interactive viewing
map1 <- tm_shape(lisa) +
  tm_fill("ii") + 
  tm_borders(alpha = 0.5) +
  tm_view() +
  tm_layout(main.title = "Local Moran's I of passenger trips",
            main.title.size = 0.8)

map2 <- tm_shape(lisa) +
  tm_fill("p_ii_sim",
          breaks = c(0, 0.001, 0.01, 0.05, 1),
              labels = c("0.001", "0.01", "0.05", "Not sig")) + 
  tm_borders(alpha = 0.5) +
  tm_layout(main.title = "p-value of local Moran's I",
            main.title.size = 0.8)

tmap_arrange(map1, map2, ncol = 2)
Variable(s) "ii" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Visualising LISA map

lisa_sig <- lisa  %>%
  filter(p_ii_sim < 0.05)
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(lisa) +
  tm_polygons() +
tm_shape(lisa_sig) +
  tm_fill("mean") + 
  tm_borders(alpha = 0.4)

Weekend/holiday morning peak

Computing adaptive distance weight matrix

wm_ad <- wkend_11to2_trips %>% 
  mutate(nb = st_knn(geometry,
                     k=8),
         wt = st_weights(nb),
               .before = 1)
! Polygon provided. Using point on surface.
wm_ad
Simple feature collection with 1502 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 3927.939 ymin: 26193.43 xmax: 48677.94 ymax: 53184.55
Projected CRS: SVY21 / Singapore TM
First 10 features:
                                nb
3       2, 4, 5, 9, 10, 13, 23, 24
3.1     1, 4, 5, 9, 10, 13, 23, 24
37    6, 7, 10, 11, 14, 15, 17, 18
65      1, 2, 5, 9, 10, 13, 23, 24
65.1    1, 2, 4, 9, 10, 13, 23, 24
67    3, 7, 10, 11, 14, 17, 18, 25
68     3, 6, 8, 11, 15, 18, 19, 26
69   7, 12, 15, 16, 19, 20, 27, 33
97      1, 2, 4, 5, 10, 13, 23, 24
98    6, 9, 11, 13, 14, 17, 24, 25
                                                         wt grid_id sum_trips
3    0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125       3        26
3.1  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125       3       213
37   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      37        52
65   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      65        26
65.1 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      65       213
67   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      67        88
68   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      68        76
69   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      69        45
97   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      97        21
98   0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      98       406
                           geometry
3    POLYGON ((4177.939 28358.49...
3.1  POLYGON ((4177.939 28358.49...
37   POLYGON ((4427.939 30523.55...
65   POLYGON ((4677.939 28358.49...
65.1 POLYGON ((4677.939 28358.49...
67   POLYGON ((4677.939 30090.54...
68   POLYGON ((4677.939 30956.57...
69   POLYGON ((4677.939 31822.59...
97   POLYGON ((4927.939 28791.5,...
98   POLYGON ((4927.939 29657.53...

Computing Global Moran’s I

In the code chunk below, the global_moran() function is used to compute the Moran’s I value. Unlike spdep, sfdep returns the result as a list containing the Moran’s I statistic (I) and the sample kurtosis (K).

moranI <- global_moran(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)
glimpse(moranI)
List of 2
 $ I: num 0.162
 $ K: num 52.9

The Moran’s I value of 0.162 shows that hexagons with similar passenger trip volumes are more likely to be near one another than under spatial randomness, a noticeably stronger positive spatial autocorrelation than in the weekday periods.

Performing Global Moran’s I test

global_moran_test(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)

    Moran I test under randomisation

data:  x  
weights: listw    

Moran I statistic standard deviate = 13.391, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     0.1619465470     -0.0006662225      0.0001474728 

Performing Global Moran’s I permutation test

In practice, a Monte Carlo simulation should be used to perform the statistical test. In sfdep, this is supported by global_moran_perm().

set.seed() is used before performing the simulation to ensure that the computation is reproducible.

set.seed(1234)

global_moran_perm(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt,
                  nsim = 99)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 100 

statistic = 0.16195, observed rank = 100, p-value < 2.2e-16
alternative hypothesis: two.sided

The report above shows that the p-value is smaller than the alpha value of 0.05. Hence, we reject the null hypothesis that the spatial pattern is spatially independent. Because the Moran’s I statistic is greater than 0, we can infer that the spatial distribution shows signs of clustering.

Computing local Moran’s I

Compute the local Moran’s I of passenger trips at the hexagon level by using local_moran() of the sfdep package.

lisa <- wm_ad %>% 
  mutate(local_moran = local_moran(
    sum_trips, nb, wt, nsim = 99),
         .before = 1) %>%
  unnest(local_moran)

Visualising local Moran’s I and p-value

Plot both local Moran’s I and p-value maps next to each other for effective comparison.

tmap_mode("view")
tmap mode set to interactive viewing
map1 <- tm_shape(lisa) +
  tm_fill("ii") + 
  tm_borders(alpha = 0.5) +
  tm_view() +
  tm_layout(main.title = "Local Moran's I of passenger trips",
            main.title.size = 0.8)

map2 <- tm_shape(lisa) +
  tm_fill("p_ii_sim",
          breaks = c(0, 0.001, 0.01, 0.05, 1),
              labels = c("0.001", "0.01", "0.05", "Not sig")) + 
  tm_borders(alpha = 0.5) +
  tm_layout(main.title = "p-value of local Moran's I",
            main.title.size = 0.8)

tmap_arrange(map1, map2, ncol = 2)
Variable(s) "ii" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Visualising LISA map

lisa_sig <- lisa  %>%
  filter(p_ii_sim < 0.05)
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(lisa) +
  tm_polygons() +
tm_shape(lisa_sig) +
  tm_fill("mean") + 
  tm_borders(alpha = 0.4)

Weekend/holiday evening peak

Computing adaptive distance weight matrix

wm_ad <- wkend_4to7_trips %>% 
  mutate(nb = st_knn(geometry,
                     k=8),
         wt = st_weights(nb),
               .before = 1)
! Polygon provided. Using point on surface.
wm_ad
Simple feature collection with 1490 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 3720.122 ymin: 26193.43 xmax: 48720.12 ymax: 53184.55
Projected CRS: SVY21 / Singapore TM
First 10 features:
                              nb
1      2, 4, 5, 8, 9, 12, 22, 23
2      1, 4, 5, 8, 9, 12, 22, 23
3    5, 6, 9, 10, 13, 14, 16, 17
4     1, 2, 8, 9, 12, 22, 23, 38
5    3, 6, 9, 10, 13, 16, 17, 24
6    3, 5, 7, 10, 14, 17, 18, 25
7  6, 11, 14, 15, 18, 19, 26, 32
8     1, 2, 4, 9, 12, 16, 22, 23
9   5, 8, 10, 12, 13, 16, 23, 24
10    3, 5, 6, 9, 13, 14, 17, 24
                                                       wt grid_id sum_trips
1  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      34        56
2  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      65        14
3  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125      99       100
4  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     127       346
5  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     129       634
6  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     130        55
7  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     131        49
8  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     159        53
9  0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     160       132
10 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125     161      1030
                         geometry
1  POLYGON ((3970.122 27925.48...
2  POLYGON ((4220.122 28358.49...
3  POLYGON ((4470.122 30523.55...
4  POLYGON ((4720.122 28358.49...
5  POLYGON ((4720.122 30090.54...
6  POLYGON ((4720.122 30956.57...
7  POLYGON ((4720.122 31822.59...
8  POLYGON ((4970.122 28791.5,...
9  POLYGON ((4970.122 29657.53...
10 POLYGON ((4970.122 30523.55...

Computing Global Moran’s I

In the code chunk below, the global_moran() function is used to compute the Moran’s I value. Unlike spdep, sfdep returns the result as a list containing the Moran’s I statistic (I) and the sample kurtosis (K).

moranI <- global_moran(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)
glimpse(moranI)
List of 2
 $ I: num 0.0975
 $ K: num 71.8

The Moran’s I value of 0.0975 shows that hexagons with similar passenger trip volumes are slightly more likely to be near one another than under spatial randomness. The clustering is, however, weaker than during the weekend/holiday morning peak (0.0975 versus 0.162).

Performing Global Moran’s I test

global_moran_test(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt)

    Moran I test under randomisation

data:  x  
weights: listw    

Moran I statistic standard deviate = 8.1, p-value = 2.748e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     0.0974587816     -0.0006715917      0.0001467698 

Performing Global Moran’s I permutation test

In practice, a Monte Carlo simulation should be used to perform the statistical test. In sfdep, this is supported by global_moran_perm().

set.seed() is used before performing the simulation to ensure that the computation is reproducible.

set.seed(1234)

global_moran_perm(wm_ad$sum_trips,
                       wm_ad$nb,
                       wm_ad$wt,
                  nsim = 99)

    Monte-Carlo simulation of Moran I

data:  x 
weights: listw  
number of simulations + 1: 100 

statistic = 0.097459, observed rank = 100, p-value < 2.2e-16
alternative hypothesis: two.sided

The report above shows that the p-value is smaller than the alpha value of 0.05. Hence, we reject the null hypothesis that the spatial pattern is spatially independent. Because the Moran’s I statistic is greater than 0, we can infer that the spatial distribution shows signs of clustering.

Computing local Moran’s I

Compute the local Moran’s I of passenger trips at the hexagon level by using local_moran() of the sfdep package.

lisa <- wm_ad %>% 
  mutate(local_moran = local_moran(
    sum_trips, nb, wt, nsim = 99),
         .before = 1) %>%
  unnest(local_moran)

Visualising local Moran’s I and p-value

Plot both local Moran’s I and p-value maps next to each other for effective comparison.

tmap_mode("view")
tmap mode set to interactive viewing
map1 <- tm_shape(lisa) +
  tm_fill("ii") + 
  tm_borders(alpha = 0.5) +
  tm_view() +
  tm_layout(main.title = "Local Moran's I of passenger trips",
            main.title.size = 0.8)

map2 <- tm_shape(lisa) +
  tm_fill("p_ii_sim",
          breaks = c(0, 0.001, 0.01, 0.05, 1),
              labels = c("0.001", "0.01", "0.05", "Not sig")) + 
  tm_borders(alpha = 0.5) +
  tm_layout(main.title = "p-value of local Moran's I",
            main.title.size = 0.8)

tmap_arrange(map1, map2, ncol = 2)
Variable(s) "ii" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Visualising LISA map

lisa_sig <- lisa  %>%
  filter(p_ii_sim < 0.05)
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(lisa) +
  tm_polygons() +
tm_shape(lisa_sig) +
  tm_fill("mean") + 
  tm_borders(alpha = 0.4)

Observation

The Local Indicators of Spatial Association (LISA) maps highlight active areas, particularly those classified as High-High regions: hexagons with high passenger volumes whose neighbouring hexagons are similarly busy. In these high-activity zones, large crowds gather for work or daily activities at key destinations such as shopping centres or office complexes.
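As a quick check on the High-High claim above, the significant cluster categories for any of the peak periods can be tabulated (a sketch; lisa_sig is the filtered object created in the LISA-map step, and mean is the cluster-label column produced by sfdep::local_moran()):

```r
# Count how many significant hexagons fall in each LISA quadrant
# (High-High, Low-Low, Low-High, High-Low).
lisa_sig %>%
  st_drop_geometry() %>%
  count(mean, sort = TRUE)
```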

This analysis shows that passenger movement patterns are not random but are concentrated at specific locations. Identifying these concentrated areas, or hot spots, is of great value to urban planners and transportation authorities, as it provides an analytical basis for understanding and evaluating the demand for transportation services. By pinpointing areas with high passenger traffic, the government and planners can make informed decisions: tailoring transportation services to demand, optimising routes, and allocating infrastructure resources efficiently, which in turn improves urban mobility and accessibility.